Introduces a new MemoryProvider plugin implementing the Hermes V3 long-term
memory design (two-tier: hot episodes + cold curated semantic_facts,
weekly human-approved promotion).
W1 scope is schema only — no read or write path yet:
- plugins/memory/sqlite_vec/{__init__,store,plugin.yaml,schema.sql}
- episodes table (hot raw turn records, channel-scoped, idempotent)
- semantic_facts table (cold curated, with valid_from/valid_to validity
windows borrowed from the MemPalace temporal-triple pattern)
- vec_facts vec0 virtual table (512-dim float32) + 3 sync triggers
  (sketched after this list)
- SqliteVecMemoryProvider class registers with MemoryProvider ABC
but prefetch/sync_turn are no-ops until W2/W3 wire them.
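For orientation, a minimal sketch of what the W1 bootstrap could look like,
assuming the sqlite-vec Python bindings and the table/column names named in
this PR (anything beyond those is illustrative; only one of the three sync
triggers is shown, and columns referenced later in the PR — embedding,
metadata, promoted_at — are included here for continuity):

```python
import sqlite3
import sqlite_vec  # the sqlite-vec Python bindings


def init_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.enable_load_extension(True)
    sqlite_vec.load(conn)  # registers the vec0 virtual-table module
    conn.enable_load_extension(False)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS episodes (
            id          INTEGER PRIMARY KEY,
            channel     TEXT NOT NULL,
            external_id TEXT NOT NULL,
            role        TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
            content     TEXT NOT NULL,
            ts          TEXT NOT NULL,
            embedding   BLOB,
            metadata    TEXT,            -- JSON; W3 stashes long-lived facts here
            promoted_at TEXT,            -- stamped by weekly_apply (W3-3)
            UNIQUE (channel, external_id)  -- channel-scoped idempotency
        );
        CREATE TABLE IF NOT EXISTS semantic_facts (
            id         INTEGER PRIMARY KEY,
            entity     TEXT NOT NULL,
            fact       TEXT NOT NULL,
            importance INTEGER NOT NULL DEFAULT 2,
            embedding  BLOB,
            created_at TEXT NOT NULL DEFAULT (datetime('now')),
            valid_from TEXT NOT NULL DEFAULT (date('now')),
            valid_to   TEXT  -- NULL = still valid (MemPalace-style window)
        );
        -- W1 shipped FLOAT[512]; W2 later changes this to int8[512] + cosine.
        CREATE VIRTUAL TABLE IF NOT EXISTS vec_facts USING vec0(
            embedding FLOAT[512]
        );
        CREATE TRIGGER IF NOT EXISTS sf_after_insert_embedding
        AFTER INSERT ON semantic_facts WHEN new.embedding IS NOT NULL
        BEGIN
            INSERT INTO vec_facts(rowid, embedding)
            VALUES (new.id, new.embedding);
        END;
        -- The update/delete triggers are analogous (update later becomes
        -- DELETE+INSERT; see the W2-1 schema fix below).
    """)
    return conn
```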
Tests (7/7 passing inside running hermes container):
- bootstrap creates all expected tables/indexes/triggers
- bootstrap is idempotent
- semantic_facts column defaults populate (created_at, valid_from)
- role CHECK constraint rejects values other than user/assistant
- triggers keep vec_facts in sync on insert/update/delete
- vec0 MATCH+k returns nearest neighbour
- provider lifecycle round-trips
Activates via $HERMES_HOME/config.yaml memory.provider: sqlite_vec
(deferred; W4 cutover only).
Refs liyoungc/hermes-memory#2 (W1-1)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements the read path for the sqlite_vec memory plugin per
docs/superpowers/specs/2026-05-02-hermes-memory-design.md §4.

embed.py: async voyage_embed() with httpx.AsyncClient, 128-text batching,
3x exponential-backoff retry on 5xx, fail-loud on missing VOYAGE_API_KEY
or 4xx. dim/dtype locked to spec values (512/int8) so config drift fails
fast.

read.py: Fact dataclass + async read_memory() using the vec0 prefilter
(k=50) and the SQL CTE rerank with locked weights 0.7*sim + 0.3*recency
(90-day half-life). bump_hits() is a fire-and-forget UPDATE that swallows
sqlite errors with a warning. p95 latency is logged as a JSON line to
~/.hermes/logs/memory.log.

W1 schema fix: vec_facts changed from FLOAT[512] to int8[512] to match
spec §1.4 (Voyage 3.5-lite, 512-dim, int8). vec0 int8 columns require the
vec_int8() SQL wrapper on INSERT, and reject UPDATE entirely even with
the wrapper, so sf_after_update_embedding now does DELETE+INSERT.

Tests: 10 new cases (mock httpx for voyage success/batching/5xx-retry/
4xx/missing-key/empty-input; read_memory orders by score and filters
expired; bump_hits increments and swallows errors; format_facts shape).
17/17 green.

Refs liyoungc/hermes-memory#4
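A sketch of what the rerank query could look like, assuming the schema above
and sqlite-vec's KNN syntax. The prose calls the decay a 90-day half-life
while the PR summary writes it as exp(-age/90); the sketch follows the
summary's formula. Requires a SQLite build with math functions (the default
since 3.35); names outside the PR are illustrative:

```python
import sqlite3
import time


def read_memory(conn: sqlite3.Connection, query_vec: bytes, k: int = 8):
    """Vec0 prefilter (k=50), then SQL CTE rerank: 0.7*sim + 0.3*recency."""
    t0 = time.perf_counter()
    rows = conn.execute(
        """
        WITH nn AS (                      -- KNN prefilter on the vec0 table
            SELECT rowid, distance
            FROM vec_facts
            WHERE embedding MATCH vec_int8(:q) AND k = 50
        )
        SELECT sf.id, sf.entity, sf.fact, sf.importance,
               (1.0 - nn.distance) AS sim,  -- cosine distance -> similarity
               0.7 * (1.0 - nn.distance)
             + 0.3 * exp(-(julianday('now') - julianday(sf.valid_from)) / 90.0)
               AS score
        FROM nn JOIN semantic_facts sf ON sf.id = nn.rowid
        WHERE sf.valid_to IS NULL OR sf.valid_to >= date('now')
        ORDER BY score DESC
        LIMIT :k
        """,
        {"q": query_vec, "k": k},
    ).fetchall()
    sql_ms = (time.perf_counter() - t0) * 1000  # the real code logs this as JSON
    return rows
```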
scripts/import_md.py seeds semantic_facts from ~/.hermes/memories/MEMORY.md
per spec §6.1. Each "Topic: content §"-delimited entry maps to one
semantic_fact row with entity prefix "禮揚." plus a slug of the topic,
importance=2, valid_from=2026-05-10, valid_to=NULL. Hierarchical topics
like "Tools & Access > ProtonMail" become entity
"禮揚.tools_access.protonmail" so prefix queries still work.
Embeds in batches of 128 via Voyage 3.5-lite. Idempotent: pre-INSERT
(entity, fact) lookup skips duplicates so re-runs are safe. Wraps the
batch-insert in BEGIN/COMMIT and rolls back on embed failure so partial
imports never land. Supports --dry-run for preview and --commit for the
real write.
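The slug rule for hierarchical topics could look roughly like this (the real
rules live in scripts/import_md.py; this is a sketch):

```python
import re


def slugify_topic(topic: str, prefix: str = "禮揚") -> str:
    """'Tools & Access > ProtonMail' -> '禮揚.tools_access.protonmail'."""
    slugs = []
    for part in topic.split(">"):
        # Python's \w matches CJK under Unicode, so topics like 生日 survive.
        s = re.sub(r"[^\w]+", "_", part.strip().lower()).strip("_")
        if s:
            slugs.append(s)
    return prefix + "." + ".".join(slugs)
```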
W1 schema fix bundled: vec0 column now declares distance_metric=cosine.
Without this, the default L2 distance on int8 vectors produces sim
values in the hundreds, breaking the 0.7*sim + 0.3*recency rerank
formula entirely. Verified end-to-end on chococlaw:
Q: "我太太生日" -> top hit "**生日**: 3/19" sim=0.604 OK
Q: "AI as digital twin" -> top hit "Think of AI as a digital twin"
sim=0.607 OK
Tests: 12 new cases for import_md (slugify simple/hierarchy/CJK/empty;
parse colon-missing/no-trailing-§; dry-run no-write; commit populates
vec_facts via trigger; idempotent re-run; partial update embeds only
new; rollback on embed failure leaves DB unchanged). 29/29 green
including W1 + W2-1.
Live import: 25 entries, 1 Voyage batch, all visible in semantic_facts
and vec_facts on chococlaw:/opt/data/memories/memory.db.
Refs liyoungc/hermes-memory#5
SqliteVecMemoryProvider.prefetch() now embeds the user message via
Voyage 3.5-lite, runs read_memory() (vec0 prefilter k=50, SQL CTE
rerank with cosine sim + 90-day half-life), and returns a markdown
block:
## Recent relevant memories
- [entity.slug] fact text (importance: N, age: D days)
...
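A sketch of the block builder, assuming the Fact fields implied by the bullet
format above (with_meta is the flag described below; field names here are
illustrative stand-ins for read.py's Fact):

```python
from dataclasses import dataclass


@dataclass
class Fact:  # stand-in for read.py's Fact
    entity: str
    text: str
    importance: int
    age_days: int


def format_facts_for_prompt(facts: list[Fact], with_meta: bool = True) -> str:
    if not facts:
        return ""
    lines = ["## Recent relevant memories"]
    for f in facts:
        meta = (f" (importance: {f.importance}, age: {f.age_days} days)"
                if with_meta else "")
        lines.append(f"- [{f.entity}] {f.text}{meta}")
    return "\n".join(lines)
```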
Activation is via config.yaml (memory.provider: sqlite_vec) — no env
var gate. Per spec §4 the persona files (SOUL.md, USER.md,
life-dimensions.md) stay in flat-file injection above this block; the
gateway's existing prompt assembler handles ordering.
Hits accounting (spec §4): retrieved fact IDs are stashed per
session_id. sync_turn() runs bump_hits() on the cached IDs *after* the
reply is delivered, so the UPDATE never sits on the user-facing
latency path. Errors are swallowed.
Async-in-sync bridge: the ABC's prefetch/sync_turn are sync, but the
gateway already owns the asyncio loop, so calling asyncio.run() inline
raises. The solution is a worker thread with its own event loop and a 5s
timeout
kill-switch. To make sqlite3 cross-thread access legal, the connection
opens with check_same_thread=False and self._lock serializes both
read_memory and bump_hits. open_db()/init_db() now take a keyword-only
check_same_thread param (default True; provider passes False).
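A sketch of that bridge (names illustrative; the point is the fresh event loop
in a throwaway thread plus the join timeout):

```python
import asyncio
import threading


def run_in_worker(coro, timeout: float = 5.0):
    """Run a coroutine from sync code while another thread owns the main loop."""
    result: list = [None]
    error: list = [None]

    def worker():
        try:
            result[0] = asyncio.run(coro)  # new event loop, owned by this thread
        except Exception as exc:           # never raise into the gateway
            error[0] = exc

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)  # the 5s kill-switch
    if t.is_alive() or error[0] is not None:
        return None  # prefetch degrades to "no memories" instead of stalling
    return result[0]
```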
format_facts_for_prompt() gained a with_meta=True flag that appends
"(importance: N, age: D days)" per fact, used by prefetch. /memdebug
will keep the compact (with_meta=False) form.
Tests: 6 new cases (markdown header, empty/trivial query no-op,
voyage error swallow, sync_turn bumps then clears cache, worker
timeout, with_meta format). 35/35 green including W1, W2-1, W2-2.
Live activation verified on chococlaw:
config.yaml memory.provider: '' -> sqlite_vec
docker compose restart gateway
Memory provider 'sqlite_vec' registered (0 tools)
sqlite_vec memory ready at /opt/data/memories/memory.db
End-to-end via MemoryManager.prefetch_all() against the real DB:
"我太太生日" returns the full 8-fact markdown block top-1 = "**生日**: 3/19".
Refs liyoungc/hermes-memory#6
plugins/memdebug/ is a standalone plugin that registers the /memdebug
slash command via the hermes-agent ctx.register_command() surface.
Memory plugins live in plugins/memory/ and load through the exclusive
loader, which doesn't pass through the slash-command registry — keeping
/memdebug separate is the cleanest split.
Behaviour (spec §7.2):
/memdebug -> short usage help
/memdebug <query> -> top-8 from semantic_facts with
score + sim + age + importance breakdown
/memdebug rawsearch <query> -> substring scan of episodes (forensics)
Each invocation logs to ~/.hermes/logs/memory.log as a JSON line so the
F2 monitoring path (% top-1 hits judged useful) can aggregate weekly.
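The log writer can be as small as one JSON object per line (a sketch; the
path shown here is illustrative — the LOG_PATH fix described below wires it
through hermes_constants instead):

```python
import json
import time
from pathlib import Path

LOG_PATH = Path.home() / ".hermes" / "logs" / "memory.log"  # illustrative


def log_json(**fields) -> None:
    """One JSON object per line keeps weekly aggregation a trivial scan."""
    entry = {"ts": time.strftime("%Y-%m-%dT%H:%M:%S"), **fields}
    LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

e.g. log_json(cmd="memdebug", q="今晚晚餐", n=8, ids=[...]) yields a line like
the second one in the tail output below.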
Reaction logging deferred: the issue acceptance criterion calls for
👍/👎 reaction prompts on the embed message, but Discord-native rich
embeds + reaction collectors require gateway-side plumbing
(gateway/platforms/discord.py) that the spec §8 marks as iterate-after-W2
work. v1 emits a textual "React 👍/👎 to flag this retrieval." cue and
relies on manual user reactions for now.
LOG_PATH bug fix bundled: both this plugin and plugins/memory/sqlite_vec/
were resolving the log path via Path.home(), which inside the hermes
container resolves to /home/hermes — not the /opt/data mount. Switched
to hermes_constants.get_hermes_home() so logs land in the mounted
~/.hermes/logs/memory.log on the host. Confirmed live:
$ tail -2 ~/.hermes/logs/memory.log
{"ts": "2026-05-02T13:06:17", "q": "今晚晚餐", "k": 8, "n": 8, "sql_ms": 2.81}
{"ts": "2026-05-02T13:06:17", "cmd": "memdebug", "q": "今晚晚餐", "n": 8, "ids": [...]}
Also fixed a Python default-arg gotcha: _open_memory_db(path=DEFAULT_DB)
bound DEFAULT_DB at def-time so monkeypatching the module global didn't
take effect. Switched to lazy lookup (path = path or DEFAULT_DB).
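The gotcha in miniature:

```python
DEFAULT_DB = "/opt/data/memories/memory.db"


# Before: DEFAULT_DB is evaluated once, when `def` runs, so a test that
# monkeypatches the module global never changes the default argument.
def _open_memory_db_before(path=DEFAULT_DB):
    ...


# After: the global is looked up on every call, so patching takes effect.
def _open_memory_db(path=None):
    path = path or DEFAULT_DB
    ...
```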
Tests: 10 new for memdebug (truncate, help/empty/rawsearch-no-arg,
semantic with score breakdown, db-missing friendly message,
rawsearch finds substring, rawsearch empty, sync entry-point dispatch,
register() wires the right name + handler shape). 45/45 green
including W1, W2-1, W2-2, W2-3.
Live verification on chococlaw:
/memdebug -> help text
/memdebug 我太太生日 -> top-1 = "**生日**: 3/19" (sim=0.604)
/memdebug rawsearch 致妤 -> "Episodes are written by W3" (placeholder)
Refs liyoungc/hermes-memory#7
plugins/memory/sqlite_vec/extract.py implements the per-turn
extraction stage of the write path.
EXTRACT_PROMPT is a verbatim copy of spec §5.2 (HARD RULES 1-4 +
JSON shape contract); paraphrasing here would compromise the F2
monitoring contract that downstream weekly review depends on.
PHI_BLACKLIST_CHANNELS = {"cmio", "cbme", "medicine"} short-circuits
to [] before any network call so hospital data never round-trips
through synthetic.new.
kimi_extract(user, assistant, channel, ts) calls Kimi K2.5 via
synthetic.new's OpenAI-compatible endpoint with temperature=0.1,
response_format=json_object, max_tokens=1024. Token usage is logged
to ~/.hermes/logs/memory.log so weekly review can spot a runaway
extract budget.
JSON parser is intentionally tolerant: in live testing Kimi K2.5
returned three different shapes for the same prompt at temperature=0.1:
1. bare list [{...}]
2. wrapped object {"analysis": "...", "extracted_memories": [...]}
3. flat single fact {"type":"episodic","text":"...","entity":...}
_parse_json_list() handles all three, falls back to the first
list-valued field, and detects single-fact dicts by canonical key
presence.
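A sketch of that tolerant parser (the canonical keys and the wrapped-object
key name are assumptions based on the shapes described above):

```python
import json

CANONICAL_KEYS = {"type", "text"}  # presence marks a flat single-fact dict


def _parse_json_list(raw: str) -> list[dict]:
    data = json.loads(raw)
    if isinstance(data, list):                        # shape 1: bare list
        return [d for d in data if isinstance(d, dict)]
    if isinstance(data, dict):
        if CANONICAL_KEYS <= data.keys():             # shape 3: flat single fact
            return [data]
        lst = data.get("extracted_memories")          # shape 2: wrapped object
        if isinstance(lst, list):
            return [d for d in lst if isinstance(d, dict)]
        for value in data.values():                   # fallback: first list field
            if isinstance(value, list):
                return [d for d in value if isinstance(d, dict)]
    return []
```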
Credential resolution: the SYNTHETIC_API_KEY env var is checked first
(test override), then auth.json's credential_pool["custom:synthetic"]
(the canonical key on chococlaw). Older / alternate layouts
(credential_pools, top-level) are also accepted for resilience.
Coercion drops malformed rows (bad type / blank text / unparseable
importance), clamps importance to 1-5, and validates entity / valid_to_hint
types. Only well-formed facts reach the caller.
Tests: 22 cases (prompt verbatim assertions, PHI blacklist (3),
parser shapes (5), coercion (3), short-circuits (2), mocked
synthetic.new full flow (5), error paths (2), auth.json round-trip).
213/213 green across all memory + scripts tests.
Live smoke test on chococlaw against real synthetic.new + Kimi K2.5:
pleasantry ("好的") -> 0 facts ✓
long-lived ("追 sleep RCT") -> 1 fact (semantic, 禮揚.研究興趣) ✓
phi-channel ("cmio") -> 0 facts (short-circuit) ✓
short-lived ("致妤 7:30") -> 0 facts ⚠ (Kimi judges "about 致妤,
not about 禮揚")
The short-lived miss is a spec-level prompt issue, not an extract.py
bug — the prompt says "memories about 禮揚" and Kimi reads that
strictly. Spec §4.1's B1 acceptance example expects this turn to
extract; matching B1 will require a spec edit (e.g. clarifying
"about 禮揚 includes 禮揚's life context"). W3-3 weekly_promotion
runs a separate thinking-mode Kimi pass over a week of episodes,
which is the spec's intended catch for hot-path misses.
Refs liyoungc/hermes-memory#8
plugins/memory/sqlite_vec/write.py implements the per-turn write-back
half of the memory system per spec §5.1.
Hot-path flow:
1. PHI gate — channel in PHI_BLACKLIST_CHANNELS short-circuits
extract (raw episode rows still land; the LLM never sees PHI).
2. kimi_extract returns ExtractedFact list (or [] on failure;
non-fatal — raw turn is still recorded so weekly_promotion can
re-extract later).
3. voyage_embed batches the user msg, reply, and every fact text
in one Voyage call. Empty strings are filtered out so we don't
waste a Voyage slot.
4. INSERT 2 rows into episodes (user, assistant) inside a single
   BEGIN/COMMIT, with ON CONFLICT(channel, external_id) DO NOTHING
   for idempotent Discord redelivery / cron-retry / restart-replay
   (sketched after this list).
5. Per-fact partition into fast-track vs stash:
* valid_to_hint parses to <= today + 30 days -> INSERT
into semantic_facts directly (the trigger mirrors into
vec_facts so the next turn's prefetch can retrieve it).
* everything else -> JSON-stash in episodes.metadata.stashed_facts
for W3-3 weekly_promotion.
6. Any exception -> rollback + append the turn (raw text, ts,
channel, msg_id, error) to ~/.hermes/logs/memory_write_failures.jsonl.
The reply was already sent; we never propagate the error.
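A sketch of step 4's idempotent insert (the per-role external_id suffix is an
assumption — the commit only says two rows land under one synthesized msg_id):

```python
def insert_episode_rows(conn, channel, msg_id, ts, user_msg, reply,
                        user_emb, reply_emb):
    conn.execute("BEGIN")
    try:
        for role, text, emb in (("user", user_msg, user_emb),
                                ("assistant", reply, reply_emb)):
            conn.execute(
                "INSERT INTO episodes"
                " (channel, external_id, role, content, ts, embedding)"
                " VALUES (?, ?, ?, ?, ?, ?)"
                " ON CONFLICT(channel, external_id) DO NOTHING",  # redelivery-safe
                (channel, f"{msg_id}:{role}", role, text, ts, emb),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```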
Threshold rationale (spec §5.3): raised from the original 7d to 30d so
short-lived facts ("下週會去日本玩五天") don't sit in metadata for a
week before the next Sunday review fires.
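The fast-track decision in sketch form (assuming ISO-date hints; the real
parser may accept more formats):

```python
from datetime import date, timedelta

FAST_TRACK_WINDOW = timedelta(days=30)  # spec §5.3, raised from the original 7d


def is_fast_track(valid_to_hint: str | None, today: date | None = None) -> bool:
    """Facts expiring within 30 days go straight to semantic_facts;
    everything else is stashed in episodes.metadata.stashed_facts."""
    if not valid_to_hint:
        return False                     # open-ended facts wait for weekly review
    today = today or date.today()
    try:
        valid_to = date.fromisoformat(valid_to_hint)
    except ValueError:
        return False                     # unparseable hint -> stash, don't guess
    return valid_to <= today + FAST_TRACK_WINDOW
```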
Provider wiring (plugins/memory/sqlite_vec/__init__.py):
sync_turn() now schedules two worker-thread coroutines after the
reply lands: bump_hits (5s budget) and write_episode (30s budget).
The thread reuses self._lock so cross-thread sqlite3 access remains
serialized. msg_id is synthesized by hashing
(session_id, user, assistant, ts-to-the-minute) so Discord
redeliveries within the same minute collapse via ON CONFLICT.
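The msg_id synthesis could look like this (assuming an epoch-seconds ts; the
hash inputs match the tuple named above):

```python
import hashlib


def synth_msg_id(session_id: str, user: str, assistant: str, ts: float) -> str:
    """Deterministic id: redeliveries within the same minute hash identically,
    so ON CONFLICT(channel, external_id) DO NOTHING collapses them."""
    minute = int(ts // 60)  # ts truncated to the minute
    key = f"{session_id}\x00{user}\x00{assistant}\x00{minute}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]
```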
No env-var gate (matches W2-3): activation is the same
config.yaml memory.provider: sqlite_vec. Rolling back the write path
specifically would require a code change (or temporarily clearing the
provider config), but the hot-path failure mode is a JSONL log entry,
not a stalled reply, so the rollback risk is low.
Tests: 11 new (parse_valid_to_hint edge cases, fast-track threshold
edge / interior / over / null, two episode rows per turn, PHI skips
extract but records, idempotent dup msg_id, short-lived fast-tracks +
mirrors to vec_facts, long-lived stashes in metadata, mixed
partition, embed failure -> JSONL + rollback, extract failure still
records raw, empty turn no embed call). 205/205 green across all
memory + memdebug + import tests.
Live verification on chococlaw:
Turn A: "今晚致妤大概 7:30 才到家" / "了解"
-> 2 episodes, 0 facts (Kimi judged "about 致妤 not 禮揚",
same prompt-wording observation logged in W3-1)
Turn B: "我下週會去日本玩五天" / "酷..."
-> 2 episodes, 1 fact fast-tracked:
(.家庭) "下週會去日本玩五天" valid_from=2026-05-02 valid_to=2026-05-11
-> vec_facts auto-mirrored via trigger (semantic_facts 25 -> 26).
-> Kimi correctly inferred valid_to from "下週" + "五天".
Cleanup: smoke test data deleted from production DB before commit.
Refs liyoungc/hermes-memory#9
Implements the cold path of the memory system per spec §5.3 + §5.4.
Two scripts (entry points in ~/.hermes/scripts/):
scripts/weekly_promotion.py - cron Sun 03:00 UTC+8 (cron expr "0 19 * * 6"
in UTC). Reads last 7 days of pending episodes, runs one Kimi call to
produce a promotion diff, persists the diff to
~/.hermes/memories/pending_diffs/wk-YYYY-MM-DD.json, renders the digest
markdown per spec §5.4, posts it to #memory-review via raw Discord HTTP.
Does NOT stamp episodes.promoted_at.
scripts/weekly_apply.py - cron Mon 03:00 UTC+8 ("0 19 * * 0" UTC).
Purges pending_diffs/*.json older than 14 days at start. Loads the
latest pending diff. If a <digest_id>.rejected sentinel file exists
(written by /memreview reject in W3-4), archives the diff as rejected
and exits. Otherwise applies promote / dedup / expire atomically and
stamps episodes.promoted_at on the candidate rows.
Both scripts emit a final stdout line {"wakeAgent": false} so the cron
framework's wake gate skips the agent run — delivery is handled inside
the script via the Discord HTTP POST helper, no LLM round-trip needed
for the cron job itself.
Core logic lives in plugins/memory/sqlite_vec/promotion.py:
- PROMOTION_PROMPT designed to mirror EXTRACT_PROMPT style: same
HARD RULES (PHI blacklist, pleasantry filter, synthetic handling,
err-on-side-of-not-promoting), four explicit actions
(PROMOTE / DEDUP_HIT / EXPIRE / DROP_AS_NOISE), and a verbatim
output schema.
- Per-candidate vec_search prefilter k=20 keeps the prompt small
  (only nearest-neighbor existing facts, not the whole active set,
  so the prompt stays bounded as semantic_facts grows past 500 rows).
- WeekDigest dataclass round-trips JSON, render_digest_markdown
matches spec §5.4 layout (Promote / Dedup / Expire / Noise sections,
emoji icons, character-truncated chunks for Discord 2000-char limit).
- discord_post chunks long messages on newline boundaries before 1990
  chars to stay under Discord's per-message ceiling (see the sketch
  after this list).
- memory_review_channel_id resolves the live channel from
~/.hermes/channel_directory.json (which stores platforms.discord
as a list of {id, name, guild, type} dicts on chococlaw).
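The chunker in sketch form (a single over-long line is passed through whole
here; the real helper may hard-split it):

```python
def chunk_message(text: str, limit: int = 1990) -> list[str]:
    """Split on newline boundaries so no chunk exceeds Discord's ~2000-char cap."""
    chunks, current = [], ""
    for line in text.split("\n"):
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) > limit and current:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```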
Critical refactor: _apply_diff_atomic embeds promote-fact texts BEFORE
opening the BEGIN/COMMIT, then writes blobs into the transaction.
Holding the writer lock open across a Voyage HTTP round-trip would
block hot-path write_episode for the duration of the call (300ms+).
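In sketch form (diff field names are illustrative; the shape of the refactor
is embed outside, write inside):

```python
async def apply_diff_atomic(conn, diff, embed):
    """Embed outside the transaction; only fast local writes run under BEGIN."""
    texts = [p["fact"] for p in diff["promote"]]
    blobs = await embed(texts)  # Voyage HTTP round-trip (300ms+), no lock held
    try:
        conn.execute("BEGIN")
        for p, blob in zip(diff["promote"], blobs):
            conn.execute(
                "INSERT INTO semantic_facts (entity, fact, importance, embedding)"
                " VALUES (?, ?, ?, ?)",
                (p["entity"], p["fact"], p["importance"], blob),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```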
Live verification on chococlaw:
Inserted 4 fixture episodes -> weekly_promotion -> Kimi call:
Kimi-K2-Thinking 404'd on synthetic.new; auto-fallback to K2.5.
Returned: 2 promote, 0 dedup, 0 expire, 1 drop_as_noise.
weekly_apply applied diff: promoted=2 stamped=4
semantic_facts: 25 -> 27 (then back to 25 after smoke cleanup)
Discord post test to #memory-review (channel 1483958144596967464):
posted=True, format renders correctly with all four sections.
Cron entries added to ~/.hermes/cron/jobs.json:
Hermes Weekly Memory Promotion - 0 19 * * 6 (Sun 03:00 UTC+8)
Hermes Weekly Memory Apply - 0 19 * * 0 (Mon 03:00 UTC+8)
Both enabled, deliver=discord, script-driven (wake-gate=false).
Tests: 17 new for promotion (prompt placeholders, hard-rule presence,
candidate / neighbor formatting, digest_id format, WeekDigest round-trip,
markdown renders all 4 sections, empty-section collapse, no-candidates
short-circuit, dry-run no-write, real-run persists diff, no-pending-diff
exit, rejection sentinel archives without applying, promote inserts +
mirrors to vec_facts + stamps episodes, dedup bumps hits, expire sets
valid_to, purge_old_pending). 222/222 green across all memory + memdebug
+ import + scripts tests.
Operational notes:
- Kimi-K2-Thinking unavailable on synthetic.new (404) - we auto-fallback
to Kimi-K2.5 with temp=0.2. Quality looks acceptable; revisit if
promotion misses obvious dedup opportunities.
- The hot-path write_episode keeps stashing long-lived facts into
episodes.metadata.stashed_facts, so the first real Sunday firing on
a chocoprod week will draw from real data.
Refs liyoungc/hermes-memory#10
The hermes scheduler hard-binds ~/.hermes/scripts/ as the only exec path
for cron jobs, so the runtime copies must live there per-host. Keeping the
canonical sources in the repo means PR review can see them, and a fresh
chococlaw rebuild is a 2-line cp + jobs.json patch.
Refs liyoungc/hermes-memory#10
plugins/memreview/ is a standalone slash-command plugin registering two
top-level commands (/memreview and /mem) per spec §7.1:
/memreview reject <digest_id> - writes
    ~/.hermes/memories/pending_diffs/<digest_id>.rejected. Monday's
    weekly_apply reads this sentinel and archives the diff without
    applying any of its promote / dedup / expire actions; candidate
    episodes stay unstamped for next Sunday's window.
/memreview pending - lists all pending digest_ids, flagging any that
    already carry a rejection sentinel.
/mem off - global kill switch. Writes HERMES_HOME/MEM_OFF. Both
    SqliteVecMemoryProvider.sync_turn (hot path) and weekly_promotion
    (cold path) check for this file at the top of each call and
    short-circuit. The read path is unaffected.
/mem on - removes the sentinel.
/mem status - human-readable state of the kill switch + the pending
    diff list.
Why slash commands rather than Discord reactions: spec §7.1 explicitly
chose slash because reactions don't reliably trigger webhook events
across all bot adapters — a silent kill-switch failure is worse than
no switch.
Sentinel file design rationale: file-system state (rather than in-memory
process flags) survives container restart, cross-thread visibility
without locks, and gives the user a manual recovery path
(touch / rm the file directly).
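The check itself is tiny (sketch; the exact import path for
hermes_constants is assumed from the LOG_PATH fix above):

```python
from pathlib import Path

from hermes_constants import get_hermes_home  # assumed import path

MEM_OFF = Path(get_hermes_home()) / "MEM_OFF"


def mem_off_active() -> bool:
    """Filesystem sentinel: survives restarts, is visible across threads
    without locks, and the user can recover manually with touch/rm."""
    return MEM_OFF.exists()
```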
Wired into the write paths:
- plugins/memory/sqlite_vec/__init__.py: sync_turn now checks
_mem_off_active() before scheduling the write_episode worker.
bump_hits still fires (it's read-side accounting).
- plugins/memory/sqlite_vec/promotion.py: weekly_promotion checks
mem_off_active() at the top of the function and returns a
"skipped: /mem off active" summary without reading episodes,
calling Kimi, or persisting any diff.
Both call sites import lazily from plugins.memreview so the memory
plugin still loads cleanly even if memreview is uninstalled.
Tests: 15 new (help text, pending list with/without rejected flag,
reject invalid/unknown/valid digest_id, /mem off+on creates/deletes
sentinel, /mem on idempotent, /mem status with and without pending,
register() wires both commands, end-to-end reject -> apply archives
without applying, /mem off short-circuits weekly_promotion before
Kimi is called). 522/522 green across all plugin tests.
Live verification on chococlaw:
1. wrote fake pending diff wk-2026-05-02.json (with a "should NEVER land"
promote entry).
2. /memreview pending — listed it.
3. /memreview reject wk-2026-05-02 — sentinel created, confirmation reply.
4. weekly_apply — archived as wk-2026-05-02.rejected.json, sentinel
auto-cleaned. semantic_facts unchanged (25 -> 25). The promote was
correctly discarded.
5. /mem off / status / on cycle — sentinel toggled at /opt/data/MEM_OFF.
Refs liyoungc/hermes-memory#11
scripts/cutover/cutover.sh is an idempotent bash script that performs the
W4 cutover steps when run with --commit. The default invocation is a dry-run.
Steps:
1. Pre-flight (verify memory.db exists, recent episodes accumulated)
2. Archive ~/.hermes/memories/MEMORY.md → MEMORY.md.archive-YYYY-MM-DD
(chmod 444 for read-only)
3. Confirm config.yaml memory.provider == sqlite_vec
4. Disable legacy memory crons (Dimensions Memory Consolidation,
Forgetting Curve) by flipping enabled=false in jobs.json
5. Smoke test the new provider end-to-end
6. Restart gateway
Spec target date 2026-05-24, after observing one successful weekly review
cycle. The caller is the user; the script is non-destructive in dry-run
mode and refuses to overwrite existing archives, so re-running after a
mid-run failure is safe.
Rollback procedure documented in hermes-memory/docs/runbooks/memory-rollback.md §3.
Refs liyoungc/hermes-memory#12
Note (2026-05-02): The implementation introduced by this PR has been
extracted to a dedicated plugin repo — see liyoungc/hermes-memory-plugin.
The 25 files added here have been removed from this fork in #4. The git
history of how we got here is preserved (this PR + its merge commit), but
the current state of the code now lives in liyoungc/hermes-memory-plugin.
liyoungc added a commit that referenced this pull request on May 2, 2026:
Removes the 25 implementation files merged in #2 (W1-W4-2). The code now
lives in a dedicated repo (liyoungc/hermes-memory-plugin) installed via:

    git clone git@github.com:liyoungc/hermes-memory-plugin.git ~/Projects/hermes-memory-plugin
    ~/Projects/hermes-memory-plugin/install.sh ~/Projects/hermes-agent

The install symlinks plugins/memory/sqlite_vec, plugins/memdebug, and
plugins/memreview from the plugin repo into hermes-agent/plugins/. For
docker-based deploys, install.sh additionally writes a
docker-compose.override.yml with bind mounts so the running container
picks up live edits without an image rebuild.

Why extract:
- git pull upstream/main on this fork is now trivial again (no merge conflicts)
- Plugin code can be installed on a vanilla NousResearch fork
- Spec edits and prompt iterations land in one place

DB at ~/.hermes/memories/memory.db is untouched. Cron jobs in
~/.hermes/cron/jobs.json migrate via install.sh.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements W1 + W2 + W3 + W4 (prepped) of the Hermes V3 long-term memory design.
W1 — schema bootstrap (done)
plugins/memory/sqlite_vec/ registering as a MemoryProvider plugin:
episodes (hot tier) + semantic_facts (cold tier) + vec_facts (vec0
virtual table) + 3 sync triggers.

W2-1 — read path + embedding wrapper (done)
embed.py: async voyage_embed() (httpx, 128 batch, 3x backoff retry,
locked dim/dtype 512/int8). read.py: Fact dataclass + async read_memory()
(vec0 prefilter k=50, SQL CTE rerank 0.7*sim + 0.3*exp(-age/90)), p95
logged. bump_hits() fire-and-forget. format_facts_for_prompt() with
with_meta flag.

W2-2 — MEMORY.md import (done)
scripts/import_md.py: parses "Topic: content §" entries, slugifies
hierarchy, preserves CJK, idempotent, atomic, --dry-run / --commit.

W2-3 — wire prefetch + sync_turn (done, live on chococlaw)
prefetch() runs in a worker thread (5s timeout) returning the recall
block. check_same_thread=False + a per-provider lock for cross-thread
sqlite3. config.yaml memory.provider: sqlite_vec (no env-var gate).

W2-4 — /memdebug slash command (done, live)
plugins/memdebug/: standalone plugin, registers /memdebug <q> and
/memdebug rawsearch <q>. Logs invocations to memory.log.

W3-1 — kimi_extract + EXTRACT_PROMPT (done)
plugins/memory/sqlite_vec/extract.py: PROMPT verbatim from spec §5.2,
PHI_BLACKLIST_CHANNELS short-circuit, tolerant JSON parser (handles 3
different Kimi output shapes observed in live testing).

W3-2 — write_episode + sync_turn write-back (done, live)
plugins/memory/sqlite_vec/write.py: per-turn write-back, fast-track
threshold 30d, JSONL failure log. write_episode runs in sync_turn after
bump_hits via a worker thread (30s timeout). msg_id synthesized via hash
for idempotency.

W3-3 — weekly_promotion + weekly_apply (done, live, cron-scheduled)
plugins/memory/sqlite_vec/promotion.py: PROMOTION_PROMPT designed;
weekly_promotion + weekly_apply async; render_digest_markdown matching
spec §5.4; discord_post helper with chunking. scripts/cron/weekly_promotion.py
+ weekly_apply.py thin wrappers (deployed to ~/.hermes/scripts/).
0 19 * * 6 (Sun 03:00 UTC+8) + 0 19 * * 0 (Mon 03:00 UTC+8).

W3-4 — /memreview reject + /mem kill switch (done, live)
plugins/memreview/: /memreview reject <digest_id> writes the sentinel;
/mem off|on|status toggles the MEM_OFF global kill switch, honored by
sync_turn (skips write_episode) and weekly_promotion (skips the Kimi
call). Read path unaffected.

W4-1 — cutover prep (prepped, awaiting soak)
scripts/cutover/cutover.sh: idempotent bash script, dry-run by default.
Archives MEMORY.md, disables legacy crons, smoke tests, restarts gateway.

W4-2 — runbooks (done, in hermes-memory repo)
docs/runbooks/memory-rollback.md: per-week / per-month / full-W4 /
failure-diagnostic procedures. docs/runbooks/memory-monitoring.md:
daily / weekly / monthly / quarterly health-check plan.

W1 schema fixes bundled across W2-W3
- vec_facts was FLOAT[512] → changed to int8[512] (W2-1). vec0 INSERT
  requires the vec_int8(blob) wrapper; UPDATE is rejected on int8 even
  with the wrapper, so the trigger was rewritten as DELETE+INSERT.
- distance_metric=cosine (W2-2).
- LOG_PATH = Path.home() resolves to /home/hermes inside the container
  (not the /opt/data mount); switched to hermes_constants.get_hermes_home()
  (W2-4).
- check_same_thread=False + per-provider lock (W2-3).
- _apply_diff_atomic was holding BEGIN open across a Voyage HTTP call —
  embed BEFORE BEGIN (W3-3).

End-to-end live verification (chococlaw)
- MemoryManager.prefetch_all returns the full markdown block
- /memdebug × 3 (help / semantic query / rawsearch)
- digest posted to #memory-review (channel 1483958144596967464)
- rejected diff archived as .rejected.json, semantic_facts unchanged

Tests
522/522 green across the W1-W3 surface in container:
Including: W1 schema (7), W2-1 read path (10), W2-2 import_md (12), W2-3 prefetch wiring (6), W2-4 memdebug (10), W3-1 extract (22), W3-2 write_episode (11), W3-3 promotion (17), W3-4 memreview (15) = 110 new tests plus existing sibling-plugin coverage that we did not regress.
Spec references
Notes for review
open_db/init_db gained a keyword-only check_same_thread param (default
True; only the provider passes False) — backwards-compatible.